Abstract

We develop a new, sensitive technique to identify active galactic nuclei (AGNs) and use it to study the nature of
low-luminosity AGNs in the Sloan Digital Sky Survey. This is the first in a series of papers, and it presents the
identification method. An emission line luminosity in a spectrum is the sum of a star formation
component and an AGN component (if present). We demonstrate that an accurate estimate of the star formation
component can be achieved by fitting model spectra, generated with a recent stellar population synthesis code, to a
continuum spectrum. By comparing the observed total line luminosity with that attributed to star formation, we can
tell whether a galaxy hosts an AGN or not. We compare our method with the commonly used emission line diagnostics
proposed by Baldwin et al. (1981; hereafter BPT). Our method recovers the same star formation/AGN classification
as BPT for 85% of the strong emission line objects, which comprise 43% of our sample. A unique feature of our
method is its sensitivity: it is applicable to 78% of the sample. We further compare our method
with BPT using stacked spectra and selection at X-ray and radio wavelengths. We show that, while the method suffers
from incompleteness and contamination as any AGN identification method does, it is overall a sensitive method to
identify AGNs. We emphasize that the method can be applied at high redshifts (up to z ~ 1.7 with a red-sensitive
optical spectrograph) without making any a priori assumptions about host galaxy properties. Another unique feature
of the method is that it allows us to subtract the emission line luminosity due to star formation and extract the
intrinsic AGN luminosity. We will make full use of these features to study the nature of low-luminosity AGNs in Paper-II.
1. Introduction

Since the seminal work by Seyfert (1943) on broad emis- 
sion line properties of spiral nebulae, there has been a rapid 
advance in our understanding of active galactic nuclei (AGNs). 
A number of spectroscopic surveys of nuclei of nearby galax- 
ies have been performed since then (e.g., Heckman 1980; Ho 
et al. 1995) and large surveys of the local universe such as the 
2 degree field survey (Colless et al. 2003) and the Sloan Digital 
Sky Survey (SDSS; York et al. 2000) have characterized prop- 
erties of AGNs with an unprecedented statistical accuracy (e.g., 
Kauffmann et al. 2003; Heckman et al. 2004; Kewley et al. 
2006; Schawinski et al. 2007). The realization that black hole 
mass correlates with host galaxy properties such as bulge lu- 
minosity and mass (Kormendy & Richstone 1995; Magorrian 
et al. 1998; Ferrarese & Merritt 2000; Gebhardt et al. 2000) 
has triggered a lot of effort to link super-massive black hole 
growth with galaxy growth. Recent simulations of galaxy evo- 
lution seem to achieve some success in reproducing observed 
properties of galaxies by incorporating energy feedback from 
AGNs, although details of such a feedback mechanism are still 
fairly uncertain (Granato et al. 2004; Croton et al. 2006; Bower 
et al. 2006). 

In order to perform an observational study of effects of 
AGNs on galaxy evolution, one first has to identify AGNs.
AGNs can be identified in a variety of ways at essentially all
wavelengths (X-rays, mid-IR, and radio detections and optical 
emission line diagnostics). The energy source of AGN activ- 
ity is the material accreting onto a central super-massive black 
hole that is heated to high temperature, traveling at high veloc- 
ity. It is thought to be the source of X-ray emission due to the 
inverse-Compton scattering of thermal photons from the accre- 
tion disk off relativistic electrons. The surrounding dust torus 
is heated by the radiation, thus resulting in strong thermal emis- 
sion at mid-IR wavelengths. In some cases, the central engine 
ejects jets perpendicular to the accretion disk, which are often
observed at radio wavelengths through synchrotron radiation.
Also, the central engine ionizes the gas in the galaxy out
to hundred-parsec to kilo-parsec scales. The ionization state of 
the surrounding gas is often higher than normal star forming 
regions, showing characteristic emission line intensity ratios. 
AGNs exhibit these unique spectral features and AGN identifi- 
cation methods aim at detecting them. 

In this paper, we focus on optical emission line techniques. 
Baldwin et al. (1981) first presented a method to separate 
AGNs from star forming galaxies using flux ratios of four 
emission lines in the optical wavelengths (Hβ, [OIII], Hα,
and [NII]). We refer to this diagnostics as BPT in what follows.
AGNs and star forming galaxies form distinct sequences
with some overlap on the BPT diagram. Theoretical calculations
based on photo-ionization models have been made to
understand the distribution on this diagram (e.g., Ho et al. 
1993; Kewley et al. 2001; Stasińska et al. 2006). For instance,
Kewley et al. (2001) performed detailed photo-ionization mod- 
eling over a range of ionization parameters and defined a region 
of the diagram where photo-ionization is unlikely due to stars. 
Massive spectroscopic surveys of the local universe such as the 
Sloan Digital Sky Survey (SDSS; York et al. 2000) revealed 
that star forming galaxies form a tight sequence on the BPT 
diagram, suggesting that the ionization state of star forming 
galaxies does not vary as widely as explored by Kewley et al.
(2001). Kauffmann et al. (2003) empirically revised the theo- 
retical curve of Kewley et al. (2001) using the data from SDSS 
and this is now a commonly used discriminator between AGNs 
and star forming galaxies. 

Despite its popularity, however, the BPT diagnostics has
disadvantages. Firstly, it requires four emission lines, which
are not always easy to measure with sufficient signal-to-noise
ratios. In particular, Hβ can be very weak in low-luminosity
AGNs and it often limits the sensitivity of BPT. Secondly, it
requires Hα and [NII]. These lines migrate to the near-IR at z > 0.5
and are hard to measure at high redshifts. Attempts
have been made to overcome these issues by using emission
lines that are observable even at z > 0.5 or by making a priori
assumptions about host galaxy properties. Rola et al. (1997)
suggested that [OII] and [NeIII]λλ3869,3968 could help identify
AGNs when Hα and [NII] are not available. Lamareille et al.
(2004) showed that [OII]/Hβ can be used in place of [NII]/Hα.
Yan et al. (2006) and Yan et al. (2011a) proposed ways to
identify AGNs using a priori assumptions about host galaxy
properties. Recently, Juneau et al. (2011) presented a new diagnostics
using [OIII]/Hβ and stellar mass of the hosts. In this paper, we
attempt to overcome these issues with a new, physically
motivated method to identify AGNs. In particular, we do not
assume any host galaxy properties a priori, as done in Yan et al.
(2006, 2011) and Juneau et al. (2011), to identify AGNs. This is
essential to study relationships between the AGN activity and
host galaxy properties.

The structure of this paper is as follows. We develop a 
new method to identify AGNs in Section 2, followed by ex- 
tensive tests of the method in Section 3. We summarize the 
strengths and weaknesses of the method and conclude the paper
in Section 4. The nature of low-luminosity AGNs and their
host galaxy properties will be presented in Paper-II. Unless
otherwise stated, we adopt Ω_M = 0.3, Ω_Λ = 0.7, and H_0 =
70 km s^-1 Mpc^-1. All the magnitudes are given in the AB
system. We use the following abbreviations: AGN for active
galactic nucleus, BPT for the Baldwin et al. (1981) diagnostics,
SF for star formation, and SFR for star formation rate.
Emission lines used in this work include [OII] λλ3726,3729,
Hβ λ4861, [OIII] λ5007, [OI] λ6300, Hα λ6563, [NII] λ6583,
and [SII] λλ6716,6730.

2. A new method 

As mentioned in the last section, the commonly used emis- 
sion line diagnostics involves intensity ratios of emission lines 
to identify a signature of AGN. An ionizing spectrum of AGN 
is typically harder than spectra of young stars, and thus AGNs
exhibit characteristic emission line intensity ratios. The most
commonly adopted Baldwin et al. (1981) diagnostics involves 
four emission lines, which are not always easy to measure at 
high signal-to-noise. That hinders efficient identification of 
low-luminosity AGNs in surveys such as SDSS. However, one 
does not necessarily have to rely on ratios of emission lines. A 
single emission line in principle contains information about an 
underlying AGN. If a galaxy hosts an AGN, the emission line 
luminosity we observe originates both from star formation and 
AGN: 

L_{measured} = L_{SF} + L_{AGN},    (1)

where L_SF is the emission line luminosity due to star formation
and L_AGN is the luminosity due to the AGN. The idea behind
our method is to estimate L_SF of a galaxy and compare it
with L_measured. If we observe a significant luminosity excess
in L_measured, it means that the galaxy shows a significant
L_AGN and it likely hosts an AGN. In this section, we develop
a method to estimate L_SF and quantify how accurate our L_SF
is. Then we move on to perform an extensive test of our AGN
identification method in the next section.

2.1. Sloan Digital Sky Survey

In this paper, we use data from the Sloan Digital Sky 
Survey Data Release 7 (Abazajian et al. 2009). The SDSS uti- 
lizes a dedicated 2.5m telescope installed at the Apache Point 
Observatory (Gunn et al. 2006) and the survey is in two parts: 
imaging and spectroscopy. The SDSS has imaged a quarter
of the sky in five photometric bands (ugriz; Fukugita et
al. 1996; Gunn et al. 1998; Doi et al. 2010) with unprecedented
accuracy (Ivezić et al. 2007; Padmanabhan et al. 2008).
The SDSS spectroscopic survey utilizes double fiber-fed
spectrographs and obtains 640 spectra simultaneously, covering a
wavelength range of 3800Å to 9200Å with a resolving power of
R ~ 2000. Each fiber subtends 3" on the sky. The survey
consists of three major components: the main galaxy sample (Strauss et
al. 2002), the luminous red galaxy sample (Eisenstein et al. 2001),
and the QSO sample (Richards et al. 2002). The main sample is a
flux-limited sample down to r = 17.77 selected from the imaging
survey and we use objects in the main sample in this paper.

We apply the following criteria to select galaxies for our 
study: SPECClass=2 (i.e., objects are galaxies) located at 
0.02 < z < 0.10 with high confidence flags (zCONF>0.8 and 
zWarning=0). We intentionally remove QSO-like objects
(SPECClass=3) from the sample because our method is not
applicable to those objects (as shown below, our method assumes
that a continuum spectrum is dominated by stars, not by
AGN). We have 283,031 objects in total, a quarter of which
are identified as AGNs by the method developed in this paper.
We correct the SDSS spectra for the Galactic extinction using
the extinction curve of Cardelli et al. (1989) and the extinction 
map from Schlegel et al. (1998). 

2.2. Spectral fitting 

How do we estimate L_SF of an emission line? It has actually
been a long-standing issue in AGN studies. AGN emission is
contaminated with star formation emission and that has often
hindered detailed studies of intrinsic AGN output. We cannot
use any emission lines to estimate L_SF because it is hard to
discriminate it from L_AGN. We have to rely on other available
information. One might use a star formation indicator such
as mid-IR emission to measure it, but one faces the same problem:
it is difficult to disentangle L_SF and L_AGN in the mid-IR.
We take a novel approach to solve the problem. AGNs show
strong emission lines, but their continuum emission is usually
very weak in optical wavelengths, except for very strong AGNs
and quasars (Binette et al. 1994). We use optical continuum
emission of galaxies, which is dominated by stellar light, not
by AGNs, to estimate L_SF. We fit observed spectra of galaxies
with model spectral templates of galaxies with various star
formation histories to obtain star formation rates (SFRs) and dust
extinction. From these numbers, we can work out L_SF.

We generate model templates using an updated version of
the Bruzual & Charlot (2003) code with improved treatment of
thermally pulsating AGB stars. Free parameters in the models
are

• Star formation history: We assume a simple, exponentially
decaying star formation rate to describe star formation
histories. The exponential time scale is allowed
to vary between 0 (i.e., instantaneous burst) and ∞ (i.e.,
constant star formation rate).

• Dust extinction: We use the two component extinction
model of Charlot & Fall (2000). We adopt μ = 0.3, which
means that 30% of the extinction is due to the ambient
interstellar medium, which affects all stars. The remaining
70% is due to dust in star forming regions, and it
affects only stars younger than 10^7 yr. We modify the
extinction curve of τ ∝ λ^-0.7 to that of Cardelli et al.
(1989), which is close to τ ∝ λ^-1.0. We justify the choice
of the Cardelli et al. (1989) curve over the Charlot &
Fall (2000) curve in the Appendix. We allow the optical
depth in the V-band, τ_V, to vary between 0 (i.e., no dust)
and 3.

• Metallicity: We use the solar metallicity models only. If 
we include super-solar and sub-solar metallicity models, 
we introduce too many degeneracies between age, metallicity,
and dust and degrade the fits. We justify the exclusion
of non-solar metallicity models in the Appendix.

• Age: We apply a logical constraint that the age of a
model template must be smaller than the age of the universe
at a given redshift. We do not use models with
ages < 1 Gyr. Due to the degeneracies between
the above mentioned parameters, we can fit galaxies
with very young models, but they often give inaccurate
SFRs. The age limit of 1 Gyr removes most of such
bad fits.
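
The template grid described in the list above can be sketched as follows. This is a minimal illustration, not the code used for the fits: the function names are hypothetical, the age of the universe uses the standard analytic solution for a flat universe with the cosmology adopted in this paper, and the SFR is normalized to a unit formed mass as an assumption.

```python
import math

# 1/H0 in Gyr for H0 = 70 km/s/Mpc (977.8 Gyr corresponds to 1 km/s/Mpc).
HUBBLE_TIME_GYR = 977.8 / 70.0


def age_of_universe_gyr(z, omega_m=0.3, omega_l=0.7):
    """Age of a flat Lambda-CDM universe at redshift z (analytic solution)."""
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * HUBBLE_TIME_GYR * math.asinh(x)


def current_sfr(age_gyr, tau_gyr, formed_mass=1.0):
    """SFR (mass per Gyr) at `age_gyr` for SFR(t) proportional to exp(-t/tau).

    Normalized so that `formed_mass` has formed between t = 0 and t = age.
    tau = 0 is an instantaneous burst, tau = inf a constant SFR.
    """
    if tau_gyr == 0.0:
        return 0.0  # burst: no ongoing star formation
    if math.isinf(tau_gyr):
        return formed_mass / age_gyr
    norm = tau_gyr * (1.0 - math.exp(-age_gyr / tau_gyr))
    return formed_mass * math.exp(-age_gyr / tau_gyr) / norm


def allowed_ages(age_grid_gyr, z, min_age_gyr=1.0):
    """Keep ages of at least 1 Gyr that are below the age of the universe at z."""
    t_max = age_of_universe_gyr(z)
    return [age for age in age_grid_gyr if min_age_gyr <= age < t_max]
```

For a galaxy at z = 0.1, `allowed_ages` drops both the sub-Gyr templates and any template older than the universe at that redshift.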

We assume the Chabrier initial mass function (Chabrier 2003).
We fit the observed spectra of galaxies with these templates using
the χ² statistics. We generate sets of the templates with
varying stellar velocity dispersions ranging from 75 to 250
km s^-1 with a 25 km s^-1 step and fit each galaxy with the set
closest to its dispersion. We have ~ 8,400 model templates in each
set. In the fitting, we mask out regions around strong emission
lines such as [OII], Hβ, [OIII], Hα, [NII], [SII], etc., because
we want to fit the spectra of stars. In addition, we mask out a
region around 5577Å, where a strong night-sky oxygen line is
located. The best-fitting models give SFRs, stellar mass, and
τ_V, which we will extensively use in our analysis. We derive an
uncertainty on each parameter by taking Δχ² = 1 from the best
fit. However, due to correlations between adjacent wavelength
points of the SDSS spectra and also to strong degeneracies between
the model parameters, the derived errors may not be accurate.
We can empirically measure the errors in, e.g., L_SF
by comparing those from direct emission line measurements as
shown below.
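
A masked χ² template fit of this kind can be sketched as below. This is only an illustration under stated assumptions: the mask half-width of 10Å is a hypothetical value (the text does not quote one), the wavelength grid is taken to be in the rest frame, and each template is given a free amplitude that is solved analytically. The function names are illustrative.

```python
# Rest-frame line centres in Angstrom, taken from the line list in Section 1.
LINE_CENTRES = [3726.0, 3729.0, 4861.0, 5007.0, 6300.0,
                6563.0, 6583.0, 6716.0, 6730.0]
SKY_LINE_OBSERVED = 5577.0  # night-sky oxygen line, observed frame


def continuum_mask(rest_wave, z=0.0, half_width=10.0):
    """True where a pixel may be used for the stellar continuum fit.

    `half_width` is an assumed value, not one quoted in the text.
    """
    centres = LINE_CENTRES + [SKY_LINE_OBSERVED / (1.0 + z)]
    return [all(abs(w - c) > half_width for c in centres) for w in rest_wave]


def best_template(flux, err, templates, good):
    """Return (index, scale, chi2) of the template with the lowest chi^2."""
    best = None
    for idx, model in enumerate(templates):
        num = den = 0.0
        for f, e, m, g in zip(flux, err, model, good):
            if g:
                num += f * m / e ** 2
                den += m * m / e ** 2
        scale = num / den  # amplitude that minimises chi^2 for this template
        chi2 = sum(((f - scale * m) / e) ** 2
                   for f, e, m, g in zip(flux, err, model, good) if g)
        if best is None or chi2 < best[2]:
            best = (idx, scale, chi2)
    return best
```

Because the emission line pixels carry no weight, a strong [OIII] line in the input spectrum does not bias the recovered template or its amplitude.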

We subtract the best-fitting model spectra from the observed
spectra to obtain continuum-subtracted spectra, from which we
measure emission line fluxes. As summarized by Tojeiro et al.
(2011), population synthesis models are not always perfect and
they under- or over-subtract the continuum in some wavelength
regions. For example, the Bruzual & Charlot (2003)
model often over-subtracts the continuum around Hβ (Asari
et al. 2007). We therefore remove the residual continuum by
median-filtering the spectra within a running box of Δλ = 60Å.
We then simply sum the fluxes around an emission line to measure
its flux within a wavelength range of |λ - λ_line| < 8Å.
One could fit a Gaussian to an emission line to measure the flux
(e.g., Tremonti et al. 2004), but AGNs may well show narrow
and broad components simultaneously. As AGNs are the focus
of this work, we do not assume any line profiles. A fraction of
galaxy light is missed by the 3 arcsec fiber. In order to correct
for the missing light, we compute the slit loss by comparing
the r-band magnitude synthesized from the spectrum with the
r-band Petrosian magnitude from imaging. The stellar masses
and SFRs of the host galaxies are corrected for the slit losses
and are indicated with a subscript apercorr in figures. Note that
this is only a first-order correction because of the assumption
employed here that the light in the fiber is representative of the
entire galaxy light.
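
The line flux measurement described above (running median followed by direct summation) might look like the sketch below. It assumes that the 60Å box is the full width of the median window and that fluxes are flux densities per Angstrom. The function name and the simple treatment of pixel widths are illustrative choices.

```python
from statistics import median


def line_flux(wave, resid, line_centre, box=60.0, window=8.0):
    """Integrated line flux from a continuum-subtracted spectrum.

    wave  : wavelengths in Angstrom (ascending)
    resid : flux density after subtracting the best-fitting stellar model
    """
    half = box / 2.0
    # Remove residual continuum with a running median (assumed full width = box).
    cleaned = []
    for w, r in zip(wave, resid):
        local = [rj for wj, rj in zip(wave, resid) if abs(wj - w) <= half]
        cleaned.append(r - median(local))
    # Sum flux density times pixel width within |lambda - lambda_line| < window.
    total = 0.0
    for i, (w, c) in enumerate(zip(wave, cleaned)):
        if abs(w - line_centre) < window:
            lo = wave[i - 1] if i > 0 else w
            hi = wave[i + 1] if i + 1 < len(wave) else w
            total += c * (hi - lo) / 2.0
    return total
```

No line profile is assumed, so a line with both narrow and broad components is summed in the same way as a single Gaussian, provided it fits inside the summation window.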

Strong AGNs often exhibit featureless continuum in the UV
(e.g., Kinney et al. 1991), which might affect our spectral fitting
and the resultant parameters. We have checked effects of
such featureless continuum by adding continuum
flux of the form ∝ ν^-1.5 (Schmitt et al. 1999) to the stellar
spectra generated with the population synthesis code. The
strength of the featureless continuum ranges from 0 to 30% of
an observed spectrum at 5500Å with a step of 5%. We find
that such a continuum decreases the accuracy of our spectral
fits. For example, the accuracy of the predicted [OII]+[OIII]
luminosity decreases to 0.35 dex (we obtain 0.24 dex without
the continuum as shown below). Furthermore, the best-fitting
models often give strong featureless continuum to star forming
galaxies selected from BPT (i.e., non-AGNs). These results
suggest that such a continuum just increases the degeneracies
between the model parameters and does not improve the fits. It
probably makes sense to include the featureless continuum in
the fits to study strong AGNs, but for our purpose of studying
low-luminosity AGNs, we choose not to include it. The
so-called 'big blue bump' seems to disappear in low-luminosity
AGNs (Eracleous et al. 2010a). This observation adds further
motivation not to include the featureless continuum in the fits.

We admit that there is room for improvement in our spectral
fitting. First, we do not use any priors in the fitting. We may
obtain better fits if we use priors on correlations between
parameters (e.g., one can assume a broad correlation between dust,
SFRs, and stellar mass), although it is not very straightforward
to set priors at high redshifts, where we actually would like to
apply the method in the future. Also, one can model more
realistic star formation histories by including a secondary burst.
Despite the simplicity, however, our models deliver good
estimates of SFRs and emission line luminosities as shown below.

2.3. Accuracy of SFRs and extinction from the spectral fits
and the predicted L_SF

Using the SFRs and dust extinction measurements derived
from the model fits, we can now estimate L_SF. But, first of all,
we shall quantify how accurate our estimates of SFRs and τ_V
from the model fits are. In Fig. 1, we compare SFRs from the
spectral fits and those from Hα corrected for extinction using
the Balmer decrement. For the purpose of quantifying the accuracy
of L_SF, we remove AGNs using the BPT diagram with
the Kauffmann et al. (2003) threshold from the figure for now.
We use galaxies with all the Hβ, [OIII], Hα and [NII] lines
detected at > 3σ here.
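
For reference, the Hα-based SFR used in this comparison can be sketched as follows. The intrinsic Hα/Hβ ratio of 2.86, the Cardelli et al. (1989) law, the Kennicutt (1998) conversion and the factor of 1.7 are from this paper. The extinction curve values k(Hα) = 2.53 and k(Hβ) = 3.61 are assumed standard values for R_V = 3.1, and the function names are illustrative.

```python
import math

# Assumed values of the Cardelli et al. (1989) curve for R_V = 3.1.
K_HALPHA = 2.53
K_HBETA = 3.61
INTRINSIC_RATIO = 2.86  # intrinsic Halpha/Hbeta


def halpha_attenuation_mag(f_halpha, f_hbeta):
    """A(Halpha) in magnitudes from the observed Balmer decrement."""
    ebv = (2.5 / (K_HBETA - K_HALPHA)
           * math.log10((f_halpha / f_hbeta) / INTRINSIC_RATIO))
    return max(ebv, 0.0) * K_HALPHA  # negative reddening is set to zero


def sfr_from_halpha(l_halpha_obs, f_halpha, f_hbeta):
    """SFR in Msun/yr (Chabrier IMF) from observed L(Halpha) in erg/s."""
    a_ha = halpha_attenuation_mag(f_halpha, f_hbeta)
    l_corrected = l_halpha_obs * 10.0 ** (0.4 * a_ha)
    return 7.9e-42 / 1.7 * l_corrected
```

An observed decrement equal to 2.86 gives no correction, while a decrement of 4.0 corresponds to roughly 0.85 mag of attenuation at Hα under these assumptions.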

We obtain reasonably good estimates of SFRs from the spectral
fits, although there is a tilt. We apply a biweight fit
and find a log-linear slope of 0.63 and the scatter around the
solid line to be only 0.23 dex (a factor of 1.7). As discussed
in the Appendix, the tilt and the mean offset become slightly
larger if we use the original τ ∝ λ^-0.7 law of Charlot & Fall
(2000). Note as well that we obtain a better correlation with a
log-linear slope of 0.8 if we do not correct for the dust using
the Balmer decrement. We will extensively use SFRs from the
spectral fits, and we could empirically correct for the observed
tilt to obtain more precise SFRs. But we choose not to do so
because it does not change our conclusions in the paper. We try
not to use empirical calibrations of the outputs from the spectral
fitting throughout the paper as long as they do not affect
our main conclusions.

While we can estimate SFRs reasonably well, our dust
estimates are not as good as we hoped for. The two component
dust model of Charlot & Fall (2000) does not seem to work
very well and our dust measurements are almost always smaller
than those from the Balmer decrements, particularly at low τ_V.
The median difference between the two extinction estimates is

τ_{V,specfit} - τ_{V,Hα/Hβ} = -0.98.

From SFR and τ_V, we can work out L_SF. In the case of Hα, we
use the following equation to derive it:

L_{Hα,SF} = SFR / (7.9 × 10^-42 / 1.7) × exp(-0.75 τ_V).    (2)

The conversion factor from SFR to Hα flux is from Kennicutt
(1998) and the factor of 1.7 is applied to change the initial mass
function from Salpeter (which is assumed in Kennicutt 1998)
to Chabrier (Asari et al. 2007). Extinction at the wavelength of
Hα is 0.75τ_V. Fig. 2 compares the predicted Hα luminosity
from the spectral fits with the measured luminosity. The figure
shows that we can make a fairly accurate (a dispersion of 0.16
dex or a factor of 1.4) prediction of Hα luminosity from the
continuum fit. This good correlation might appear surprising
given the poor dust estimates, but it is due to the degeneracies
between SFR and dust: slightly underestimated SFRs
and underestimated dust extinction nearly cancel out and give
us good emission line luminosity estimates. We perform a biweight
fit to the data and obtain a log-linear slope of 0.80. This
is a relatively small tilt and the accuracy of the predicted
luminosity is encouraging.

Fig. 2. Predicted Hα luminosity from the spectral fits plotted against
measured Hα luminosity. The slope of the log-linear fit and dispersion
around it are shown in the plot.
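
Equation (2) can be evaluated directly. The sketch below assumes the units of Kennicutt (1998), i.e., SFR in solar masses per year and luminosity in erg s^-1. The function and constant names are illustrative.

```python
import math

KENNICUTT_1998 = 7.9e-42      # SFR [Msun/yr] per L(Halpha) [erg/s], Salpeter IMF
SALPETER_TO_CHABRIER = 1.7    # IMF conversion factor quoted in the text


def predicted_halpha_luminosity(sfr, tau_v):
    """Observed (dust-attenuated) L(Halpha) from star formation, Eq. (2)."""
    intrinsic = sfr / (KENNICUTT_1998 / SALPETER_TO_CHABRIER)
    return intrinsic * math.exp(-0.75 * tau_v)
```

For example, SFR = 1 solar mass per year with τ_V = 1 gives about 1.0 × 10^41 erg s^-1, roughly half of the dust-free value.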

We emphasize that Fig. 2 includes all the effects from
observations and model fits. As discussed above, our models may
not be the most realistic models and there is room to improve
the fitting procedure, but we can still make a fairly good
prediction of L_SF. Continuum may provide SFRs on different
time scales from those from Hα. Hα comes from star forming
regions and is a good probe of instantaneous SFRs, while
continuum may provide SFRs smoothed over an extended period.
The scatter we observe in Fig. 2 is perhaps partly due to the
different time scales probed. But, including all these effects,
we obtain a remarkable accuracy with only a small tilt. We
find that this scatter does not significantly reduce if we use Hα
measured at > 10σ only, suggesting that the scatter is primarily
due to scatter in L_Hα,SF. The errors in the physical parameters
from the spectral fits may not be accurate due to correlations
between the adjacent wavelength points and to model
degeneracies as mentioned above, but the observed dispersion gives
us a good quantitative estimate of the error in L_Hα,SF, which
is a factor of 1.4 with a small tilt.

Fig. 1. Left: SFRs from spectral fits plotted against SFRs from Hα corrected for the dust extinction using the Balmer decrement assuming the extinction
law of Cardelli et al. (1989) and intrinsic Hα/Hβ ratio of 2.86 (Osterbrock & Ferland 2006). We plot every fifth object for clarity. The solid line shows
a log-linear fit to the data and the slope and dispersion around the fit are shown in the plot. Note that AGNs identified with BPT are removed from the
plot. Right: τ_V from the spectral fits plotted against τ_V from the Balmer decrement. The points and error bars show the median and 25th-75th percentile
interval (they are shown only at τ_V,specfit bins with more than 100 galaxies).

Fig. 3. Left: Observed [OII] to Hα flux ratio plotted against stellar mass. The contours enclose 5%, 25%, 50%, 75% and 95% of the galaxies. The
points and error bars show the median and quartile of the distribution in each stellar mass bin. The curve shows the fitted relation. Right: Same as the
left panel, but [OIII] to Hα flux ratio is taken in the vertical axis.

Fig. 5. Predicted [OII]+[OIII] luminosities due to star formation plotted
against measured luminosities. The contours show star forming
galaxies and the dots show AGNs selected from the BPT diagram.
Only every fifth AGN is plotted for clarity. The dotted line shows
the threshold of Oxygen-excess and galaxies below it are defined as
Oxygen-excess galaxies.

In addition to the Balmer lines, we make extensive use
of the [OII] and [OIII] lines. To derive [OII] and [OIII], however, we
need an empirical calibration because these lines depend strongly
on metallicity, and we cannot estimate the metallicity
of galaxies from the spectral fits as discussed in the Appendix.
We fit a relationship between the observed flux ratios of [OII] or
[OIII] to Hα and stellar mass as shown in Fig. 3. The idea
is to use stellar mass as a proxy of metallicity given the tight
mass-metallicity relation (Tremonti et al. 2004). Using these
empirical calibrations, we can now predict [OII] and [OIII]
luminosity due to star formation (i.e., L_SF) as we know the stellar
mass and Hα luminosity from the spectral fits. We find that
the predicted [OII] and [OIII] luminosities are slightly tilted
with respect to the observed luminosities (a log-linear slope of
~ 0.8). The tilt does not strongly affect our results in this paper
and those in Paper-II and for simplicity we assume that the
predicted and observed luminosities of all the emission lines
are correlated with a log-linear slope of 1 with a small constant
offset.
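
The empirical calibration can be illustrated with the sketch below. The functional form of the fitted relation is not given in the text, so this example swaps in binned medians with linear interpolation in place of a fitted curve. The bin edges, function names and sample values are hypothetical.

```python
from statistics import median


def binned_median_relation(log_mass, log_ratio, edges):
    """Median log(line/Halpha) in stellar mass bins (stand-in for the fitted curve)."""
    centres, medians = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        values = [r for m, r in zip(log_mass, log_ratio) if lo <= m < hi]
        if values:
            centres.append(0.5 * (lo + hi))
            medians.append(median(values))
    return centres, medians


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation, held constant outside the sampled range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def predicted_line_luminosity(l_halpha_sf, log_mass, centres, medians):
    """L_SF of [OII] or [OIII] from L(Halpha, SF) and stellar mass."""
    return l_halpha_sf * 10.0 ** interpolate(log_mass, centres, medians)
```

The relation would be calibrated on star forming galaxies only and then applied to every galaxy, with the stellar mass and Hα luminosity taken from the spectral fits.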

2.4. Line of choice: Hα, [OII], or [OIII]?

Now we have developed a technique to predict L_SF for the
strong lines such as [OII], [OIII], and Hα. Which line(s) should
we use to identify AGNs? We compare these three lines in
Fig. 4. We define AGNs using the BPT diagnostics with the
Kauffmann et al. (2003) threshold just as a guideline for now
and show them with dots in the figures. Interestingly, the
collisionally excited oxygen lines of AGN hosting galaxies show
a clear offset with respect to star forming galaxies, while the Hα
line shows only a small offset. This is what is expected from the
BPT diagram, which clearly shows enhanced strengths of
collisionally excited lines compared to the Balmer lines in AGNs
(AGNs form a sequence towards the top-right corner of the
BPT diagram, not to the bottom-left corner). In this work, we
shall use [OII] and/or [OIII] for our purpose of AGN
identification.

As we will show in Paper-II, the ionization state of the nar- 
row line regions in AGNs spans a wide range. Given this wide 
range of ionization, we deem that a sum of [OII] and [OIII]
is likely a better indicator of AGN activity than either one
of them because these two lines balance each other in gaseous 
nebulae and the sum of them is more robust against variations 
in the ionization states. We therefore use the sum of the two 
Oxygen lines to identify and to characterize AGNs. 

Using the technique developed above, we can predict the 
[OII]+[OIII] luminosities fairly well as shown in Fig. 5. We
define Oxygen-excess galaxies as those with 



log10( L_{[OII]+[OIII]} / L_{[OII]+[OIII],SF} ) - L_{[OII]+[OIII],offset}
    > 1.5 × log10 σ(L_{[OII]+[OIII],SF}),    (3)

where L_{[OII]+[OIII],offset} is a systematic offset between
the predicted luminosity and measured luminosity for star
forming galaxies (+0.16 dex as shown in Fig. 5) and
σ(L_{[OII]+[OIII],SF}) is the accuracy of our luminosity predictions
(0.24 dex). Our selection criterion is that if an object has
an [OII]+[OIII] luminosity that exceeds the expected luminosity
due to star formation by > 1.5σ, it is an Oxygen-excess
object. This sigma cut is a trade-off between completeness and
contamination (e.g., if we reduce the threshold to 1σ, we have a
better sampling of BPT AGNs at the cost of increased contamination
by star forming galaxies). Our choice of 1.5σ is simply
a compromise between them, but we have confirmed that our
results do not significantly change if we change it to 1σ or 2σ.
The adopted threshold is shown as the dotted line in Fig. 5. We
further require a significant detection of [OII]+[OIII] at > 5σ
to ensure that we do not suffer from noise. Since the idea is
to identify an oxygen emission line excess, we dub our method
the "Oxygen-excess method".

As a quick check of the Oxygen-excess method developed 
here, we plot in Fig. 6 the distribution of the Oxygen-excess 
galaxies on the BPT diagram. 43% of galaxies in our sample 
show strong enough emission lines to apply the BPT diagnos- 
tics and are plotted as the contours in the figure. Among these 
strong emission line objects, we find that our SF/AGN clas- 
sifications agree with BPT for 85% of objects. The Oxygen- 
excess objects are mostly (75%) in the AGN region of the BPT 
diagram. The rest of them are in the SF region, but we note that
the SF/AGN threshold of Kauffmann et al. (2003) is arbitrarily
defined. If we use the criterion proposed by Stasińska et al.
(2006), the fraction increases to 87%.
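
The selection and the comparison with BPT can be sketched as below. Several points are assumptions rather than statements from this text: the offset is taken to be measured in log10(L_measured/L_SF) for star forming galaxies, the scatter is used directly in dex, the signal-to-noise threshold is left as a required argument, and the demarcation curve is the one published by Kauffmann et al. (2003). All function names are illustrative.

```python
import math


def is_oxygen_excess(l_measured, l_sf, offset_dex, sigma_dex,
                     snr, min_snr, n_sigma=1.5):
    """Flag an object whose [OII]+[OIII] luminosity exceeds the SF prediction.

    offset_dex : systematic offset of log10(L_measured / L_SF) for
                 star forming galaxies (assumed sign convention)
    sigma_dex  : scatter of the prediction in dex
    """
    if snr <= min_snr or l_measured <= 0.0 or l_sf <= 0.0:
        return False
    excess = math.log10(l_measured / l_sf) - offset_dex
    return excess > n_sigma * sigma_dex


def bpt_class(log_n2_ha, log_o3_hb):
    """'AGN' or 'SF' using the Kauffmann et al. (2003) demarcation curve."""
    if log_n2_ha >= 0.05:  # right of the asymptote of the curve
        return "AGN"
    curve = 0.61 / (log_n2_ha - 0.05) + 1.3
    return "AGN" if log_o3_hb > curve else "SF"


def agreement_fraction(excess_flags, bpt_labels):
    """Fraction of objects on which the two classifications agree."""
    same = sum((flag and label == "AGN") or (not flag and label == "SF")
               for flag, label in zip(excess_flags, bpt_labels))
    return same / len(bpt_labels)
```

With a scatter of 0.24 dex and the default 1.5σ cut, an object must lie more than 0.36 dex above the offset-corrected prediction to be selected.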

The right plot of Fig. 6 shows the distribution of projected
distances of the galaxies to the threshold curve. Galaxies are
distributed continuously around the (arbitrarily set) threshold. This
continuous sequence from star forming galaxies to AGNs probably
represents a wide range of AGN activity with respect to
underlying star formation. The BPT diagnostics misses weak
AGNs in actively star forming galaxies (so does our method
but with an improved sensitivity to weak AGNs; see the next
section and Paper-II) and it would not be surprising at all if
some of the galaxies in the star forming region of the diagram
actually host AGNs. In fact, the Oxygen-excess objects
in the star forming region of the diagram are skewed towards
distance ~ 0, while normal star forming galaxies form
a peak around distance ~ -0.15. If these Oxygen-excess objects
were pure contamination, we would have seen a peak at
distance ~ -0.15 with an extended tail to distance ~ -0.5.
The skewed distribution of the Oxygen-excess objects suggests
that they are not pure contamination of star forming galaxies.
We will make further attempts to characterize our method in
the next section.

In addition to the most commonly used BPT diagnostics, we 
also present [0III]/H^ vs. [Ol]/Ha and [oill]/H/3 vs. [Sll]/Ha 
diagrams (Veilleux & Osterbrock 1987) in the Appendix for 
completeness. As pointed out by earlier papers (e.g., Kewley 
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Fig. 4. Predicted line luminosities due to star formation plotted against measured line luminosities. The panels show, [Oil], [OIII], and Ha from left 
to right. We show objects with significant line detections (> 3cr). In each panel, the contours show star forming galaxies and the dots show AGNs 
selected from the BPT diagram (plotted every five objects). The dashed lines show Lsf = Lmeasured- An offset from the dashed line and dispersion 
in log(Lsp I J-'measured) E^ch panel are for star forming galaxies. 




Distance from the BPT curve 



Fig. 6. Left: BPT diagram. The contours encircle 5%, 25%, 50%, 75% and 95% of all the galaxies used here (i.e., 0.02 < z < 0.1 and all the four
lines detected at > 3σ). The dots show Oxygen-excess objects (plotted every five objects). The dashed curve is from Kauffmann et al. (2003). Just for
reference, we also show the Kewley et al. (2001) threshold as the dotted curve. The cross on the bottom-right shows the median error in the line ratios.
Right: The distribution of 'distance' of galaxies from the Kauffmann et al. (2003) curve in the left panel. A negative distance means that galaxies are
in the star forming region, while a positive distance means that galaxies are in the AGN region. The vertical dashed line is the Kauffmann et al. (2003)
threshold. The open and filled histograms show all the galaxies and the Oxygen-excess galaxies, respectively.



et al. 2006; Stasińska et al. 2006), the separation between the
star formation and AGN sequences is less clear if we use [OI] or
[SII] in place of [NII]. [NII] seems to work better because the
[NII]/Hα ratio saturates at high metallicities (Kewley & Dopita
2002), likely due to its secondary nature (Kewley & Dopita
2002; Stasińska et al. 2006). But the [OI] and [SII] lines are
interesting in their own right. For instance, the [OI] line comes
from partially ionized nebulae and hence is sensitive to the
hardness of the ionizing spectrum and is a good probe of AGNs.
Readers are referred to the Appendix for further discussion.

Finally, we emphasize that the Oxygen-excess method can be
applied at much higher redshifts than BPT. Although
[OII]+[OIII] is the most effective set of lines, one can in
principle use any line to identify an emission line excess. A
practical application of the method would be to use [OII] only.
One then risks missing high-ionization AGNs that exhibit weak
[OII], but the gain is that one can go up to z ~ 1.7 with a
red-sensitive optical spectrograph and study AGN evolution
over a wide redshift range. Our method will be ideal for
identifying AGNs and studying their host galaxy properties in
on-going/near-future massive spectroscopic surveys such as the
SDSS-III Baryon Oscillation Spectroscopic Survey.

3. Comparisons between Oxygen-excess, BPT, X-ray and 
radio sources 

Following the development of the Oxygen-excess method, 
we make extensive tests of the method by comparing with other 
AGN detection methods in this section. First, we further com- 
pare with the BPT diagnostics. A significant fraction of galax- 
ies in our sample (43%) show too weak emission lines to ap- 
ply the BPT diagnostics, but they have strong enough Oxygen 
lines to apply the Oxygen-excess method. We test how well our 
method works in such weak emission line galaxies by stacking 
spectra. The stacked spectra are also useful to characterize av- 
erage properties of various classes of objects. We then compare 






with X-ray sources identified in archival Chandra observations 
and also with radio sources from FIRST. 

Before we present our results, it is important to emphasize 
that none of the Oxygen-excess, BPT, X-ray, and radio meth- 
ods is a perfect method to identify AGNs. Each method has 
pros and cons and they all suffer from incompleteness and con- 
tamination. The most relevant numbers to quote in this section 
would be fractions of missing AGNs and contaminating non- 
AGNs in the Oxygen-excess objects. But, we are unable to 
provide these numbers because no AGN identification method 
gives a complete sampling of AGNs. Nonetheless, we make an 
attempt to quantify whether a majority of the Oxygen-excess
objects are real AGNs or not. Note that we cannot reach any
clear conclusion for intermediate types of objects that are
fundamentally difficult to classify (see Ho 2008 for a review of
the subject).
Also, we will discuss objects that are photo-ionized by non- 
AGN sources in section 3.4. 

3.1. Stacked objects on the BPT diagram 

We define classes of galaxy populations in Table 1 to compare
the Oxygen-excess objects with the BPT objects. In our
notation, 'O' and 'B' stand for Oxygen-excess and BPT, and
'+' and '-' mean AGN and SF. For example, O+B- are the
objects that are identified as AGN by the Oxygen-excess
method but classified as SF by BPT. We use an 'n' for objects
to which we cannot apply the Oxygen-excess or BPT method due
to weak emission. Note that we use the threshold by Kauffmann
et al. (2003) to define AGN and SF on the BPT diagram.
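
The notation can be made concrete with a short sketch. The Kauffmann et al. (2003) demarcation is written here in its commonly quoted form, log([OIII]/Hβ) = 0.61 / (log([NII]/Hα) - 0.05) + 1.3. The function names and the use of None for "method not applicable" are illustrative assumptions, not the code used for this paper.

```python
def is_bpt_agn(log_n2_ha, log_o3_hb):
    """True if a point lies above the Kauffmann et al. (2003) curve."""
    # The curve diverges at log([NII]/Ha) = 0.05; everything to the
    # right of that asymptote is on the AGN side.
    if log_n2_ha >= 0.05:
        return True
    return log_o3_hb > 0.61 / (log_n2_ha - 0.05) + 1.3


def class_label(oxygen_excess, bpt_agn):
    """Build a Table 1 style label such as 'O+B-'.

    Each argument is True (AGN), False (SF), or None when the method
    cannot be applied because the lines are too weak. Table 1 only
    uses 'On' together with 'Bn'; this helper does not enforce that.
    """
    symbol = {True: "+", False: "-", None: "n"}
    return "O" + symbol[oxygen_excess] + "B" + symbol[bpt_agn]


# Hypothetical object: Oxygen excess, line ratios in the SF region.
print(class_label(True, is_bpt_agn(-0.6, -0.3)))
```

Passing None for the BPT flag yields the 'Bn' classes, for which only the Oxygen-excess method is available.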

In our spectral fitting described in the last section, we sub- 
tract the continuum using the best-fitting model template and 
further by applying the median filter to remove the residuals. 
We stack these continuum subtracted spectra in each class us- 
ing the inverse-variance weights to make the average emission 
line spectra. We also perform the median stacking in addition 
to the inverse-variance stacking. The emission line strengths in 
the stacked spectra are somewhat different between these two 
stacking techniques, but the line ratios, which we will soon dis- 
cuss, are not very different. Note that we combine the spectra 
in apparent flux density, not in distance corrected luminosity 
density (i.e., only the wavelengths are corrected to rest-frame) 
because the AGN/SF classification is limited by the observational
flux limit (this is especially relevant to O+Bn, O-Bn, and
OnBn classes) and we would like to show typical observed flux 
densities of objects in these classes and compare them with 
stronger emission line objects. We have confirmed that the line 
ratios we discuss below are essentially the same regardless of 
whether we correct for the distance or not. 
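
The two stacking schemes described above can be sketched as follows. This is a minimal illustration, assuming the continuum-subtracted spectra have already been resampled onto a common rest-frame wavelength grid and that masked pixels carry zero inverse variance; the function name and array layout are assumptions, not the authors' pipeline.

```python
import numpy as np


def stack_spectra(flux, ivar):
    """Stack spectra that share one rest-frame wavelength grid.

    flux, ivar : arrays of shape (n_spectra, n_pixels), where
    ivar = 1 / sigma**2 and ivar = 0 marks a masked pixel.
    Returns (inverse-variance weighted mean, median) per pixel.
    """
    flux = np.asarray(flux, dtype=float)
    ivar = np.asarray(ivar, dtype=float)

    # Inverse-variance weighted mean; pixels with no valid data stay NaN.
    weight_sum = ivar.sum(axis=0)
    weighted = np.full(flux.shape[1], np.nan)
    good = weight_sum > 0
    weighted[good] = (flux * ivar).sum(axis=0)[good] / weight_sum[good]

    # Median stack, ignoring masked pixels.
    median = np.nanmedian(np.where(ivar > 0, flux, np.nan), axis=0)
    return weighted, median


# Three toy spectra with two pixels each (apparent flux density).
flux = [[1.0, 2.0], [3.0, 4.0], [5.0, 60.0]]
ivar = [[1.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
weighted, median = stack_spectra(flux, ivar)
print(weighted, median)
```

The weighted mean favors high signal-to-noise spectra, while the median is less sensitive to outliers, which is one reason the line strengths can differ between the two stacks.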

Fig. 7 presents the stacked spectra of all classes. Note the
very high quality of the stacked spectra. Typical emission
line fluxes in the OnBn class are comparable to the typical noise
level of the SDSS spectra. The stacked spectra have sufficient
signal-to-noise to measure emission line fluxes for all classes
of objects. We can now measure the line ratios and study their
average properties. O+Bn and O-Bn are particularly interesting
because we cannot apply the BPT diagnostics to these
galaxies individually. We show the locations of the stacked
objects on the BPT diagram in Fig. 8 and discuss each class of
objects below. The numbers in parentheses are the fractions of
objects in that class relative to the entire sample.



• O+B+ (6.3%): We observe strong emission lines in the
stacked spectra. The line flux ratios on the BPT diagram
indicate that the galaxies host AGNs.

• O+B- (2.3%): This is an interesting class of objects.
The galaxies are classified as star forming galaxies on
the BPT diagram, but we observe an Oxygen flux excess.
The stacked spectrum ranges from the middle of the star
forming sequence to the threshold line of Kauffmann et
al. (2003). The observed offset towards the threshold
compared to O-B-, which are nearly pure star forming
galaxies (see below), suggests that these galaxies may harbor
weak AGNs with underlying active star formation. But
the BPT diagram does not give us an estimate of the
relative numbers of AGNs and contaminating star forming
galaxies. This is one of the difficult classes to characterize.

• O-B+ (4.5%): The galaxies are defined as AGNs from
the BPT diagram, but we do not observe a significant
Oxygen flux excess. The stacked spectra and the location
on the BPT diagram show that these galaxies likely host
AGNs with active underlying star formation. The X-ray
analysis below also suggests that they are likely AGNs.
The majority of the galaxies in this class are AGNs and
our method misses them. It could be that we miss them
due to statistical fluctuations of our flux predictions (the
scatter is a factor of 1.7; Fig. 5). But it could also
be that the AGN continuum gives a non-negligible
contribution to the overall continuum spectra. The AGN
continuum likely has a power-law form at optical wavelengths,
making the spectra bluer. Although such an AGN continuum
is typically fairly weak in weak AGNs (Schmitt et
al. 1999), we may over-estimate SFRs if the AGN continuum
happens to be strong. As a result, we may miss
AGNs. As mentioned earlier, we cannot measure the
amount of such a continuum contribution well with our
spectral fits. GALEX photometry may give us an insight
into the AGN continuum, but our poor estimates of
dust extinction would not allow us to study near-far UV
luminosities because of their strong sensitivity to dust. A
more sophisticated spectral fitting would be needed. We
characterize the host galaxy properties of O-B+ in the
Appendix and show that this missing AGN population
does not affect our conclusions in Paper-II.

• O-B- (29.4%): These objects are in the star forming 
region on the BPT diagram and we do not observe an 
Oxygen flux excess. The stacked spectrum is indeed in 
the middle of the star forming region of the BPT dia- 
gram. Therefore, these objects are likely star forming 
galaxies. 

We can apply the BPT diagnostics to all the objects discussed
so far'. We emphasize that the SF/AGN classification
by the Oxygen-excess method is consistent with BPT
for 85% of the objects (O+B+ and O-B-). Intermediate cases
(O+B- and O-B+) are somewhat challenging to fully
characterize, but they make up only 15% of the strong emission
line objects. We also emphasize that we could apply the BPT
diagnostics only to 43% of objects in our sample. Although we
define our sample at 0.02 < z < 0.10, which is below the median
redshift of the Main galaxy sample (z ~ 0.1), more than
half of the objects remain unclassified. This is where our
method has a great advantage: it can be applied to
nearly twice as many objects as BPT. We discuss these weak
emission line objects below.

' If we use the Kewley et al. (2001) threshold to define AGN/SF, the
fractions in the above classes are 2.5%, 5.8%, 0.3%, 32.5% for O+B+, O+B-,
O-B+, and O-B-, respectively.

Fig. 7. Stacked spectra. From top to bottom, the spectra are for O+B+, O+B-, O-B+, O-B-, O+Bn, O-Bn, and OnBn, respectively. The most prominent
emission lines are labeled. The numbers of objects used for the stacking are shown as well.

Table 1. Galaxy population classes.

                     BPT-AGN   BPT-SF   4 lines unavailable
Oxygen excess        O+B+      O+B-     O+Bn
No Oxygen excess     O-B+      O-B-     O-Bn
Oxygen unavailable   --        --       OnBn

Fig. 8. BPT diagram. The contours show galaxies with strong emission lines as in Fig. 6. The locations of the stacked objects are indicated by the circles
and arrows. The filled and open circles are measured from the inverse-variance weighted stacking and from the median stacking, respectively.


• O+Bn (17.2%): The galaxies in this class do not show
strong enough lines to apply the BPT diagnostics, but
their [OII]+[OIII] lines are strong enough to apply our
method. The large fraction of galaxies in the O+Bn and
O-Bn classes demonstrates the sensitivity of our method
to low-luminosity objects. The emission lines in the
stacked spectrum are weak as expected. Hβ is often
the weakest line among the four lines used in the BPT,
and that limits the sensitivity of BPT to low-luminosity
AGNs. The line ratios from the stacked spectra clearly
show that these objects actually host AGNs. This is
strong evidence that the Oxygen-excess method works
well in identifying such low-luminosity AGNs. The X-ray
and radio analyses presented below also show that
most of the objects are likely AGNs. We note that
there has been considerable debate as to whether
low-luminosity, low-ionization objects (LINERs; Heckman
1980) are powered by AGNs. There are other energy
sources proposed in the literature that can produce weak
LINER-like emission. We discuss those fake AGNs in
Section 3.4.

• O-Bn (18.1%): The stacked spectrum shows weak
emission lines. Hβ is stronger than [OIII] and this class
of objects is unlikely to host strong AGNs. The stacked
object lies on the border of the threshold line in the BPT
diagram. This suggests that, while the majority of the
objects do not host AGNs, we may have a small contamination
of AGNs (see the radio analysis below). O+Bn and
O-Bn are the classes to which we can apply only the
Oxygen-excess method. The positions of these objects on the
BPT diagram clearly demonstrate that our classification
for such weak emission line objects is fairly reasonable.

• OnBn (22.1%): A large fraction of all the objects do
not show even weak emission lines and fall in this category.
If we stack their spectra, very weak emission lines
emerge. Most of the lines are comparable in strength to
the typical noise level in the SDSS spectra. The line ratios
in Fig. 8 indicate that LINERs reside in such apparently
quiescent galaxies. We also observe that a fraction of
them are detected in radio despite their weak emission (see
below).

To sum up, for bright objects to which we can apply the
BPT diagnostics, the Oxygen-excess method gives
consistent SF/AGN classifications for 85% of the objects. While
Fig. 9. L_[OII]+[OIII] distributions of the Oxygen-excess objects
(filled circles) and BPT AGNs (open circles).



[Fig. 10 plot: x-axis log10([NII]/Hα); filled circles: Oxygen excess; open circles: No Oxygen excess]



Fig. 10. Distribution of X-ray sources on the BPT diagram. The curves 
and the contours are as in Fig. 6. The filled and open points show X-ray 
sources with and without Oxygen flux excess, respectively. 

the BPT diagnostics are applicable to 43% of all the objects,
the Oxygen-excess method is applicable to 78%. This results
in a higher fraction of Oxygen-excess objects (26%) than
BPT AGNs (11%). Fig. 9 illustrates our sensitivity to
low-luminosity objects. The Oxygen-excess method can identify
a significant number of low-luminosity objects that are missed
by BPT. The stacked spectra show the average properties of
objects in each class. Except for a small fraction of discrepant
cases (O+B- and O-B+, 7% of the total), the stacked spectra on
the BPT diagram show that the AGN/SF classifications are made
well. In particular, low-luminosity objects for which BPT
is not applicable are classified fairly well. This is an
encouraging result and motivates us to perform a further test of
the method using X-ray and radio data.
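
The summary percentages follow directly from the per-class fractions quoted in the class descriptions above. The short sketch below only does that bookkeeping; the dictionary and variable names are illustrative. Because the per-class fractions are themselves rounded, the derived numbers agree with the quoted 43%, 78%, 26%, 11% and 7% after rounding, and the agreement fraction comes out at about 84%, close to the quoted 85%.

```python
# Class fractions (percent of the entire sample) quoted in the text.
frac = {"O+B+": 6.3, "O+B-": 2.3, "O-B+": 4.5, "O-B-": 29.4,
        "O+Bn": 17.2, "O-Bn": 18.1, "OnBn": 22.1}

# BPT needs all four lines, so every class ending in 'Bn' drops out.
bpt_applicable = sum(v for k, v in frac.items() if not k.endswith("Bn"))
# The Oxygen-excess method fails only for the 'On' class.
oxygen_applicable = sum(v for k, v in frac.items() if not k.startswith("On"))

oxygen_excess = sum(v for k, v in frac.items() if k.startswith("O+"))
bpt_agn = sum(v for k, v in frac.items() if k.endswith("B+"))

# Agreement is measured among objects where both methods apply.
agreement = (frac["O+B+"] + frac["O-B-"]) / bpt_applicable
discrepant = frac["O+B-"] + frac["O-B+"]

print(f"BPT applicable:           {bpt_applicable:.1f}%")
print(f"Oxygen-excess applicable: {oxygen_applicable:.1f}%")
print(f"Oxygen-excess objects:    {oxygen_excess:.1f}%")
print(f"BPT AGNs:                 {bpt_agn:.1f}%")
print(f"Agreement where both apply: {100 * agreement:.1f}%")
print(f"Discrepant classes:       {discrepant:.1f}% of total")
```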

3.2. X-ray sources 

We base our X-ray analysis on the CSC-SDSS cross-match
catalog release 1.1 from the Chandra website (Evans et al.
2010). The catalog has Chandra sources from archival data
matched to SDSS objects. Because the catalog is serendipitous
in nature, the data depth varies across the sky and the data
are not defined in any systematic way. However, this
serendipity in turn allows us to perform a statistical test of
our method because no specific type of Oxygen-excess object
prefers any particular patch of Chandra observations.

We use SDSS objects that have point-like Chandra counterparts
within 2 arcsec from the center to ensure that we are looking
at sources at the galaxy centers, while accommodating the
PSF degradation at large off-axis angles. This results
in 56 X-ray sources, which are the subject of the analyses
in this subsection. We have also cross-matched the archival
XMM-Newton sources (Watson et al. 2009). We find that the
conclusions in this section remain the same if we use the XMM
sources, but there seems to be a slightly larger number of
non-AGN sources, possibly due to the poorer angular resolution
of XMM compared to Chandra. We thus use the Chandra data
for the analysis here.
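
A positional match of this kind can be sketched as below. This is a brute-force illustration under stated assumptions (coordinates in degrees, nearest neighbor within a fixed radius); the function names are hypothetical and the CSC-SDSS catalog itself is already cross-matched, so this is not the procedure used to build it.

```python
import numpy as np


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec (inputs in degrees), haversine form."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    return np.degrees(2.0 * np.arcsin(np.sqrt(a))) * 3600.0


def match_within(ra_opt, dec_opt, ra_src, dec_src, radius_arcsec=2.0):
    """Index of the nearest source within radius_arcsec of each optical
    position, or -1 where no source is that close."""
    ra_opt = np.asarray(ra_opt, dtype=float)
    dec_opt = np.asarray(dec_opt, dtype=float)
    ra_src = np.asarray(ra_src, dtype=float)
    dec_src = np.asarray(dec_src, dtype=float)
    # Full (n_opt, n_src) separation matrix; fine for small catalogs.
    sep = angular_sep_arcsec(ra_opt[:, None], dec_opt[:, None],
                             ra_src[None, :], dec_src[None, :])
    nearest = sep.argmin(axis=1)
    close = sep[np.arange(len(ra_opt)), nearest] <= radius_arcsec
    return np.where(close, nearest, -1)


# Two toy galaxies; only the first has a source 1 arcsec away.
idx = match_within([150.0, 150.1], [2.0, 2.1],
                   [150.0, 10.0], [2.0 + 1.0 / 3600.0, 10.0])
print(idx)
```

For large catalogs a tree-based search would replace the full separation matrix; the matching radius is the only tunable parameter in this sketch.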

We do not correct for absorption due to the Galactic neutral
hydrogen because essentially all the objects have a low hydrogen
column density due to the Galaxy of a few times 10^20 cm^-2
(Dickey & Lockman 1990). Based on Morrison & McCammon
(1983), we estimate that the absorption in the hard band
(2-7 keV) is negligible. The absorption is still small in the soft
band (0.5-1.2keV) with an optical depth of a few times 10~^. 
The correction for intrinsic absorption requires sufficient X-ray 
photons and a detailed spectral analysis, which is beyond the 
scope of this work. However, an intrinsic absorption is likely 
below 10^^ cm^^ at the X-ray luminosity range we explore 
here (Mainieri et al. 2007). At this column density, the hard
X-ray luminosity is largely unaffected by absorption. Our hard
X-ray luminosities should therefore be reasonable estimates.

First, we put the X-ray objects in the BPT diagram in Fig.
10. Note that we cannot show X-ray detected O+Bn, O-Bn, and
OnBn objects in the diagram due to their weak emission. Most
of the X-ray sources are in the AGN region of the diagram.
There are several sources in the star forming region, but most
of them do not show any Oxygen excess.

Due to the serendipitous nature of the X-ray catalog, we can
compare the relative frequency of X-ray detections of the various
classes of objects defined in Table 1. We normalize the X-ray
detection frequency of the O+B+ objects to unity and show
the relative frequencies in Table 2. We also show the X-ray
properties of the objects in Fig. 11. There are several
sources of X-ray emission: AGNs, supernova remnants,
high/low-mass X-ray binaries, and hot thermal plasma in the
halos. To quantify the dominance of AGNs in each panel, we
define a region of the diagram where we expect contamination
from non-AGN sources, shown by the dashed line. This definition
is motivated by the fact that these non-AGN sources are likely
soft, low-luminosity sources. For example, Irwin et al. (2003)

We have cross-matched 108 objects with the archival XMM sources
within the 1σ positional error (but we impose a maximum separation of 3
arcsec). We find that a larger fraction of objects have soft X-ray
emission, whose origin could be either AGN or non-AGN sources, compared
to the Chandra sources. Although we restrict the sources to be consistent
with PSF sizes, the galaxies optically extend over an angular size comparable
to the resolution of XMM. We suspect that the XMM luminosities suffer
from more contamination from non-AGN sources such as X-ray
binaries than the Chandra luminosities do.

Table 2. X-ray and radio detection rates in each class normalized
to that of O+B+. Objects that could be contaminated by
non-AGN X-ray/radio emission are excluded from the statistics (see
text for details). Due to this conservative analysis, the numbers
here should not be regarded as the purity of the AGN sample.
The numbers also suffer from X-ray/radio sensitivity limits
and the fractions are naturally lower in lower luminosity AGNs.

Class   X-ray            Radio
O+B+    1.000 ± 0.186    1.000 ± 0.035
O+B-    0.097 ± 0.097    0.240 ± 0.028
O-B+    0.242 ± 0.108    0.088 ± 0.012
O-B-    0.015 ± 0.011    0.009 ± 0.001
O+Bn    0.140 ± 0.042    0.258 ± 0.010
O-Bn    0.012 ± 0.012    0.027 ± 0.003
OnBn    0.040 ± 0.020    0.083 ± 0.005


showed that most of the low-mass X-ray binaries have a hardness
ratio less than between the soft (0.3-1.0 keV) and hard
(2-6 keV) bands. The definition is further motivated to include
most of the O-B- galaxies, whose X-rays are likely of star
formation or other non-AGN origin. But this is still a somewhat
arbitrary definition and should not be over-interpreted. AGNs
may well populate the dashed box. Note that we have
excluded these possible non-AGNs from the statistics in Table 2
to be conservative.

We discuss each class of objects in what follows. 

• O+B+ : Most of the objects (~80%) show Lx >
10^^ erg s^^ with relatively hard spectra. They are likely 
AGNs. 

• O+B- : We have only one X-ray source, which is likely
an AGN due to its high luminosity {Lx ^ 10"'^ erg s~^). 
We do not further discuss this class of objects due to the 
poor statistics. 

• O-B+ : Many of the objects in this class are relatively
luminous X-ray sources with medium hardness. They 
are likely AGNs. Our method misses this class of ob- 
jects. 

• O-B- : These objects are likely star forming galaxies.
Most of them are soft, low-luminosity X-ray sources. But
there are luminous, medium-hardness sources, which are
likely AGNs. The X-ray detection rate is only 2% of that
of O+B+, so such AGNs are very rare.

• O+Bn : These are AGN candidates identified by the
Oxygen-excess method, but their emission lines are too
weak to apply the BPT diagnostics. The distribution of
objects in Fig. 11 is relatively similar to that of O+B+,
and most objects in this class are likely AGNs. The
X-ray detection rate is not as high as for O+B+, but this is
probably because AGNs in this class are weaker, given
the weak emission lines. X-rays clearly show that the
Oxygen-excess method works well in identifying such
weak AGNs.

• O-Bn : We do not observe any significant Oxygen flux
excess in these objects, and most of the X-ray sources
in this class are indeed low-luminosity soft sources, with
the exception of one very hard source. As inferred from the
stacked spectrum, a small fraction of objects in this class
may be real AGNs. But we note the very small X-ray
detection rate in Table 2 (1%).

• OnBn : X-ray detections of these objects may be
surprising as we do not observe any significant emission
lines. There seems to be a sequence of soft X-ray objects with
X-ray luminosities between ^ lO'^^ and ~ 10*° erg s~^. 
We find a clear correlation between the X-ray luminosi- 
ties of these objects and their stellar mass, which lends 
support to the low-mass X-ray binary origin (Kim & 
Fabbiano 2004). It may also be that diffuse thermal 
emission contributes to the observed X-ray luminosity 
(Flohic et al. 2006). There are a few sources with hard- 
ness ratio around 0. We exclude the possibility of the 
supernova and high-mass X-ray binary origin for these 
sources because the host galaxies are quiescent galax- 
ies with very little on-going star formation. Although 
we cannot be conclusive, their hardness ratios seem to 
suggest that they are unlikely due to low-mass X-ray bi- 
naries. They may possibly be obscured AGNs with very 
weak optical emission lines. 

The comparisons with X-ray sources give another quantitative
estimate of the robustness of the Oxygen-excess method.
Although X-rays suffer from contamination from non-AGN
sources, the numbers from the conservative analyses in Table 2
and Fig. 11 suggest that the Oxygen-excess method separates
AGNs from star forming galaxies well, although it does miss
a fraction of AGNs (e.g., O-B+). The SF/AGN classifications
of low-luminosity objects for which BPT is not applicable (i.e.,
O+Bn and O-Bn) are good (Fig. 11), demonstrating the
sensitivity of the method to low-luminosity AGNs.

3.3. FIRST sources

We turn our attention to radio sources. Radio wavelengths 
have also been used to identify (distant) AGNs (Miley & De 
Breuck 2008 and references therein). In this subsection, we 
use data from the FIRST survey (Becker et al. 1995; White et 
al. 1997) to study the properties of the Oxygen-excess objects. 

The SDSS objects and FIRST sources are cross-matched
within 1 arcsec. We have confirmed that our results are not
sensitive to the matching radius (a matching radius of 2 arcsec
gives the same results). This is a simple positional matching
and we may well miss extended radio sources such as jet
lobes. However, such extended radio emission is relatively
rare (~ 10%; Lin et al. 2010) and should not strongly alter
our conclusions. We do not reject extended sources from
the matching because AGN point sources may be buried under
extended radio emission from star formation, given the poor
spatial resolution of FIRST. In total, we have 1,747 matches in
our sample. We adopt f_integrated from the FIRST catalog as
the radio power. Note that our results remain essentially the same
if we use f_peak. We apply the k-correction to the radio power
assuming a power-law radio spectrum (Condon 1992).
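
For a power-law spectrum S_nu ∝ nu^(-alpha), the k-corrected monochromatic power is P = 4 pi D_L^2 S_nu (1+z)^(alpha-1). The sketch below implements this standard relation. The spectral index alpha and the luminosity distance are inputs supplied by the caller; the values in the usage line are illustrative assumptions only, not values taken from this paper.

```python
import math

MPC_IN_M = 3.0856775814913673e22   # metres per megaparsec
JY_IN_SI = 1.0e-26                 # W m^-2 Hz^-1 per jansky


def radio_power_w_hz(flux_mjy, z, d_lum_mpc, alpha):
    """Rest-frame monochromatic power in W/Hz for S_nu ∝ nu**(-alpha).

    flux_mjy  : observed flux density in mJy (e.g. an integrated flux)
    z         : redshift
    d_lum_mpc : luminosity distance in Mpc (cosmology chosen by the caller)
    alpha     : assumed spectral index
    """
    flux_si = flux_mjy * 1.0e-3 * JY_IN_SI
    d_m = d_lum_mpc * MPC_IN_M
    # (1+z)**-1 accounts for bandwidth compression, (1+z)**alpha moves
    # the emitted frequency back to the observed one.
    return 4.0 * math.pi * d_m**2 * flux_si * (1.0 + z) ** (alpha - 1.0)


# Illustrative numbers only: 2 mJy source, z = 0.05, D_L = 220 Mpc, alpha = 0.8.
print(f"{radio_power_w_hz(2.0, 0.05, 220.0, 0.8):.3e} W/Hz")
```

At the low redshifts of this sample the correction factor (1+z)^(alpha-1) stays close to unity for any plausible spectral index, so the choice of alpha has little effect on the radio power.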

We plot in Fig. 12 the distribution of radio detected sources
on the BPT diagram. The radio detected objects tend to spread
around the AGN sequence and extend to the bottom of the SF
sequence. This is in contrast to X-ray sources, which are
mostly located away from the Kauffmann et al. (2003) threshold
curve as shown in Fig. 10. This is partly due to increased
contamination from star formation activity in radio
wavelengths compared to X-rays.

Fig. 11. Hardness ratio between soft (0.5 - 1.2 keV) and hard (2.0 - 7.0 keV) bands plotted against hard X-ray luminosity. The panels show subclasses
of AGNs defined in Table 1. The dots are all the sources and the points are those in each class. The dashed rectangle on the bottom-left of each panel
defines the region in which we expect contamination of star forming galaxies or other non-AGN sources. The numbers in each panel show the fraction
of objects outside of the rectangle.

Fig. 13. SFRs against radio power. The panels show each class of AGNs defined in Table 1. The contours are all the radio sources and the dots are those
in each class. The dashed curve is the SFR-radio power relation from Hopkins et al. (2003) shifted downwards by 0.7 dex (i.e., a factor of 5). Objects
below the curve are likely dominated by AGNs and the numbers in each panel show the fraction of objects below the curve.

Fig. 12. As in Fig. 10, but here we plot objects detected in the FIRST
survey.

Fig. 13 shows SFRs from the spectral fits against the radio
power for each class of objects defined in Table 1. A star
formation sequence of galaxies can be seen at the top-left corner
of the diagram. We shift the radio power-SFR relation from
Hopkins et al. (2003) downwards by a factor of 5 to separate
AGN-dominated and SF-dominated objects, as shown by the
dashed curve. A large fraction of O+B+ objects are dominated
by star formation, which is in contrast to X-rays (Fig. 11). This
illustrates the elevated contamination from star formation in
radio wavelengths. But we emphasize that not all the galaxies in
the star forming region of the diagram are pure star forming
galaxies; they may well host AGNs whose radio emission
is weaker than that due to star formation. We remove
such galaxies from the analysis just to be conservative.
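
The separation described above amounts to comparing each object's SFR with the relation lowered by 0.7 dex. The sketch below shows only that comparison. The relation itself is passed in as a callable, and the toy linear relation in the usage lines is a hypothetical placeholder, not the Hopkins et al. (2003) calibration.

```python
def radio_agn_dominated(sfr, radio_power, sfr_from_radio_power, shift_dex=0.7):
    """True if an object lies below the SFR-radio power relation after the
    relation is shifted downwards in SFR by shift_dex.

    sfr_from_radio_power : callable giving the SFR expected from star
        formation alone at a given radio power (supplied by the caller).
    """
    threshold = sfr_from_radio_power(radio_power) / 10.0 ** shift_dex
    return sfr < threshold


def toy_relation(radio_power):
    """Placeholder relation for illustration only (assumed, not from any paper)."""
    return radio_power / 1.0e21


# 0.7 dex corresponds to a factor of about 5.
print(10.0 ** 0.7)
print(radio_agn_dominated(1.0, 1.0e22, toy_relation))   # far below the relation
print(radio_agn_dominated(5.0, 1.0e22, toy_relation))   # within a factor of 5
```

Objects for which the function returns False are the ones excluded from the radio statistics as possibly star formation dominated.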

Despite the significant contamination from star formation, 
the FIRST detection rates shown in Table 2 are fairly useful. 
As expected, O+B+ and O-B- show the highest and lowest radio
detection rates, respectively. Interestingly, O+B- shows a
relatively high detection rate. This is encouraging because it 
shows that not all the objects in this class are contamination 
and some of the O+B- objects are real AGNs. We comment on 
the other classes below. 

• O+Bn : As expected from their weak emission lines,
galaxies in this class do not show strong on-going star
formation. Most of the objects are not in the star forming
region of the diagram and they are likely AGNs. We
note that those in the star forming region occupy only the
bottom half of the star forming sequence (compare with
O-B-, for example), suggesting that even these galaxies
have some level of AGN activity. We have shown from
the stacking and X-ray analyses that objects in this class
are mostly real AGNs. This radio analysis adds
further support to that conclusion.

• O-Bn : Roughly 40% of the radio detected sources are
likely AGNs. If we compare the panels for O+Bn and
O-Bn, the classifications look reasonable, but it seems
that we do misclassify a fraction of AGNs in this class,
as already suggested in the stacking analysis. However,
the radio detection rate in Table 2 is fairly low and we do
not regard this as a big issue.

• OnBn : Some of the apparently quiescent galaxies
host AGNs. The radio detection rate is 8% of that of O+B+,
which may be relatively high for such quiescent galaxies.
This class of objects clearly shows that we cannot identify
all AGNs with optical methods only. Deep multi-wavelength
data are essential to sample the entire AGN
population.

The radio detections provide another interesting test of the 
Oxygen-excess method. While we do suffer from a level of 
contamination and incompleteness, the Oxygen-excess method 
overall works well. In fact, the radio detection rates in Table 2 
are the highest where we observe excess Oxygen luminosities. 

To summarize, we have compared the Oxygen-excess
method with BPT, X-ray and radio. As mentioned at the
beginning of this section, it is hard to quantify the purity and
incompleteness of the Oxygen-excess method because no AGN
identification method gives a complete sampling of AGNs. But
the stacked spectra show that the SF/AGN classifications are
on average very reasonable, and the X-ray and radio analyses
support this. The results from these analyses all lead us
to conclude that the Oxygen-excess method is a good statistical
tool to identify AGNs. In particular, it is fairly sensitive to
low-luminosity AGNs for which BPT is not applicable. It of course
suffers from incompleteness (e.g., O-B+) and contamination
(e.g., a fraction of O+B- galaxies) as all the other methods do,
but it classifies AGNs and star forming galaxies well. It is a
powerful statistical tool to identify AGNs from spectroscopic
data.

3.4. Ionizing source — accreting material or evolved stars? 

Finally, we discuss contamination of objects that are photo- 
ionized by non-AGN sources. Most of the identified AGNs 
are low-luminosity, low-ionization objects (LINERs: Heckman 
1980) as quantified in Paper-II and there are several possi- 
ble origins of LINER-like spectra: low-ionization AGNs (Ho 
et al. 1993; Ho et al. 1997), shocks due to supernova/jets 
(Cox 1972; Heckman 1980; Dopita & Sutherland 1995; Dopita 
& Sutherland 1996), Wolf-Rayet stars (Terlevich & Melnick 
1985; Kewley et al. 2001), post-starburst (Taniguchi et al. 
2000), and post-AGB stars (Binette et al. 1994; Stasińska
et al. 2008; Sarzi et al. 2010; Cid Fernandes et al. 2011).
LINERs likely constitute a heterogeneous class of objects (Ho
2008). The existence of broad permitted lines in a fraction of
LINERs provides strong evidence for the AGN origin (Ho et
al. 1993; Ho et al. 1997), but the relative contributions of the
other ionizing mechanisms to the overall LINER population remain
unclear.

Let us focus on the O+Bn class, in which more than 60%
of the Oxygen-excess objects fall. These objects show only
weak emission lines and therefore their underlying star
formation activity is weak (otherwise we would have observed
strong emission lines). For such objects, we can reject
Wolf-Rayet stars and supernovae as a primary cause of the Oxygen
flux excess because these sources play a role only in star
forming galaxies. The post-starburst origin is unlikely to produce
such a large number of AGNs and is also unlikely to be a primary
cause. Cid Fernandes et al. (2011) suggest that galaxies with
EW(Hα) < 3 Å are likely due to post-AGB stars based on a
photo-ionization calculation of a single burst stellar population.
We find that a significant fraction (75%) of O+Bn objects have
EW(Hα) < 3 Å. But we argue that many of them are actually
AGNs. We find that two-thirds of O+Bn objects detected
in X-rays (excluding those in the dashed region of Fig. 11)
have EW(Hα) < 3 Å. We obtain a fairly similar number for
those detected in radio (67%; again, those dominated by star
formation are excluded from the statistics). Recently, Capetti
& Baldi (2011) also reported radio detections of quiescent
galaxies in SDSS. These X-ray and radio detected sources are
probably real AGNs, and this suggests that EW(Hα) < 3 Å does
not necessarily mean that the observed Oxygen excess is due
to post-AGB stars. The fact that a fraction of OnBn objects
are detected in radio (Table 2) adds a further argument
against the EW(Hα) < 3 Å criterion.

However, recent integral field spectroscopy of nearby galaxies
has revealed diffuse extended emission in early-type galaxies
and seems to provide evidence of contributions from post-AGB
stars. Sarzi et al. (2010) observed that nearby early-type
galaxies with radio detections show extended emission. These
galaxies show LINER-like emission line ratios, and Sarzi et al.
(2010) suggested that post-AGB stars can supply enough ionizing
photons to explain the observation and thus that the
photo-ionization is primarily due to post-AGB stars. However,
the uncertainty in the fraction of ionizing photons that are
reprocessed into line emission and our limited understanding of
the last stage of stellar evolution seem to hamper a firm
conclusion. Extended line emission was also recently observed by
Yan & Blanton (2011). They claimed that the ionization parameter
increases towards larger radii from the galaxy center and that
the most natural explanation would be ionization by post-AGB
stars, although they found that post-AGB stars cannot supply
enough ionizing photons. One can turn the question around and
ask whether AGNs can supply enough photons to explain the
observed emission lines. Maoz et al. (1998) and Eracleous et
al. (2010b) reported that a fraction of LINERs show a severe
deficit of ionizing photons and need other ionization sources.
From these observations, there is no doubt that the
Oxygen-excess objects are contaminated by non-AGN emission and
that at least a fraction of the observed emission line
luminosities is likely due to post-AGB stars.

However, the contribution from post-AGB stars may not be
very significant. We discuss this in depth in Section 3.3 of
Paper-II, but briefly outline our argument here. We observe
a clear correlation between optical emission line luminosities
and hard X-ray luminosities (Fig. 3 of Paper-II). Post-AGB stars
are not luminous in hard X-rays, so the hard X-ray luminosity
is a good measure of AGN activity. On the other hand, optical
emission lines can be significantly contaminated by post-AGB
photo-ionization. The observed clear correlation suggests that
the contribution from post-AGB stars to the observed optical
emission is not severe. The hard X-ray detected sources have
typical properties of the Oxygen-excess objects (they have
stellar masses of > 10^11 M☉ and are mostly red galaxies with
low SFRs of < 0.1 M☉ yr^-1), and thus they represent the
Oxygen-excess objects well. One might worry that non-AGN sources
such as low-mass X-ray binaries contribute to the observed
X-rays, because these binaries often contribute significantly to
the overall X-ray emission in massive quiescent galaxies.
However, we show in Fig. 5 of Paper-II that the hard X-ray
luminosity does not correlate with stellar mass. A rather weak
dependence of X-ray luminosity on host galaxy mass is also
observed by other authors (Mullaney et al. 2012; Aird et al.
2011). This excludes a low-mass X-ray binary origin of the
observed X-rays because their contribution should increase with
increasing stellar mass (Kim & Fabbiano 2004). The most likely
origin of the hard X-rays is therefore AGN, and the clear
correlation between hard X-ray and optical luminosity shown in
Paper-II suggests that the contamination from post-AGB stars to
the observed emission lines is not significant.
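The argument above rests on two rank-correlation checks: hard X-ray
luminosity should correlate with emission line luminosity, but not
with stellar mass. A minimal sketch of such a check is given below,
using a Spearman rank coefficient written with the Python standard
library only. The function names and the five-object input lists are
hypothetical illustrations, not values or code from this paper or
Paper-II.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of equal values (ties).
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Synthetic, purely illustrative log-luminosities and log-masses.
log_lx = [40.1, 40.8, 41.3, 41.9, 42.5]
log_lline = [39.0, 39.4, 40.1, 40.3, 41.0]
log_mstar = [10.6, 11.2, 10.9, 10.5, 11.0]

print("L_X vs L_line:", spearman(log_lx, log_lline))
print("L_X vs M_star:", spearman(log_lx, log_mstar))
```

In this toy input the first coefficient is close to one and the second
is close to zero, which is the qualitative pattern the text describes.
A real analysis would also need to treat upper limits and the flux
limits of the X-ray catalog.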

The X-ray sample is just a small portion of the entire
Oxygen-excess sample. We construct another subsample to further
quantify the role of post-AGB stars. This is a subsample of
quiescent (i.e., SFR is zero) Oxygen-excess objects in a narrow
redshift slice, chosen to eliminate any redshift effects. The
emission due to post-AGB stars should correlate strongly with
the stellar mass contained within the area covered by the
fibers, while the AGN emission is unlikely to be strongly
correlated with mass (Aird et al. 2011; Mullaney et al. 2012).
Within the narrow redshift slice, we find that the observed
emission line luminosities depend only weakly on stellar mass,
which suggests that post-AGB stars do not significantly
contribute to the overall emission. Based on a very simple
model, we find that 23% of the emission is due to post-AGB stars
in typical Oxygen-excess objects with a stellar mass of
10^11 M☉ within the fibers.
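The text does not spell out the "very simple model". One possible toy
version, which is an assumption made here for illustration and not
necessarily the model behind the quoted 23%, treats the observed line
luminosity as a mass-independent AGN term plus a post-AGB term
proportional to the stellar mass inside the fiber. A straight-line
fit then gives the post-AGB fraction at any reference mass. All names
and numbers below are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def post_agb_fraction(mass, slope, intercept):
    """Fraction of line luminosity from the mass-proportional term.

    Assumes L_obs = intercept (AGN) + slope * mass (post-AGB stars).
    """
    stellar = slope * mass
    return stellar / (intercept + stellar)


# Synthetic data: fiber mass and line luminosity in arbitrary
# linear units, built as L = 3.0 + 0.1 * M.
fiber_mass = [2.0, 4.0, 6.0, 8.0, 10.0]
line_lum = [3.2, 3.4, 3.6, 3.8, 4.0]

slope, intercept = fit_line(fiber_mass, line_lum)
print(post_agb_fraction(10.0, slope, intercept))  # 0.25 by construction
```

A weak dependence of luminosity on mass corresponds to a small slope
and therefore to a small post-AGB fraction, which is the sense of the
argument in the text.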

It seems that a large fraction of O+Bn objects are likely
AGNs. However, we still do not know the exact abundance of
non-AGNs or the exact fractional contribution of post-AGB
photo-ionization to the observed emission line luminosities.
We may well have Oxygen-excess objects whose emission is
completely powered by post-AGB stars. It has been a challenging
task to pin down the abundance of true/false AGNs (Ho 2008),
and this would probably require deep multi-wavelength
observations of the nuclear regions of a well-defined sample of
galaxies. The SDSS fibers subtend 3 arcsec on the sky and
include a substantial fraction of the bulge and disk components
under the typical seeing conditions of the site (~1.5 arcsec).
We deem that the SDSS data are not suited to address the issue
in depth. Also, our poor understanding of the last phase of
stellar evolution puts a further limit on our ability to
constrain the role of post-AGB stars. For these reasons, we do
not try to go further here. This unknown fraction of
contaminating non-AGNs (although it is expected to be small)
remains one of the major uncertainties in the results presented
in Paper-II, and a more detailed study of both observational and
theoretical aspects would be needed to put a more stringent
constraint on the role of non-AGN photo-ionization.

4. Summary 

We have developed a novel technique to identify AGNs
based on the very simple idea of comparing expected and
observed emission line luminosities. We perform spectral fits
of the SDSS galaxies to obtain SFRs and dust extinction, from
which we compute the expected emission line luminosities due
to star formation. By comparing the expected luminosities with
the observed luminosities, we can statistically identify AGNs.
In this comparison, we use [OII] + [OIII] luminosities. This
choice is motivated by the fact that AGNs span a wide range in
ionization state, as quantified in Paper-II.
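The comparison summarized above can be sketched as a simple
threshold test on the [OII] + [OIII] luminosity. The sketch below is
illustrative only: the function name, the assumed scatter of the
star formation prediction (0.3 dex), and the 1.5-sigma cut are
hypothetical placeholders, not the selection criterion adopted in
this paper.

```python
def classify_oxygen_excess(l_obs, l_sf, scatter_dex=0.3, n_sigma=1.5):
    """Compare observed and star-formation-predicted luminosities.

    l_obs : observed [OII] + [OIII] luminosity (linear units)
    l_sf  : luminosity expected from star formation (same units)
    scatter_dex, n_sigma : hypothetical tolerance on the prediction
    Returns (label, agn_luminosity).
    """
    if l_obs <= 0.0:
        # No usable line detection: the object cannot be classified.
        return "unclassified", 0.0
    threshold = l_sf * 10.0 ** (n_sigma * scatter_dex)
    if l_obs > threshold:
        # The residual after removing star formation is attributed
        # to the AGN.
        return "excess", l_obs - l_sf
    return "no excess", 0.0


print(classify_oxygen_excess(1.0e41, 1.0e40))
print(classify_oxygen_excess(1.2e40, 1.0e40))
```

The second return value corresponds to the AGN luminosity left after
subtracting the star formation component, the feature used
extensively in Paper-II.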

To test the newly developed method, we have made extensive
comparisons with other AGN identification methods, namely BPT,
X-rays, and radio. Our method suffers from contamination and
incompleteness, as all the other methods do. However, the
average properties from the stacked spectra and the detection
rates of X-ray and radio sources all suggest that the
Oxygen-excess method is a good statistical method to identify
AGNs. A unique feature of the method is its sensitivity. We
have demonstrated that our method is applicable to ~80% of the
galaxies, while BPT can be applied to only ~40%. All the
analyses above show that the Oxygen-excess method works fairly
well in identifying low-luminosity objects that BPT cannot
classify. Another unique feature, which we have not emphasized
in this paper, is its capability to subtract the star formation
component from the observed emission line luminosity to extract
pure AGN emission, which is crucial to characterize AGN
activity. We will make extensive use of these features to study
the nature of low-luminosity AGNs in Paper-II.

To summarize, the strengths and weaknesses of our method
are as follows:
STRENGTHS: 

• It requires only the sum of [OII] and [OIII]. Note that the
BPT diagnostics require four lines and take ratios of
the lines, meaning that one needs to detect each line at
a sufficiently significant level. Hβ is often the weakest
line in AGNs, and that limits the sensitivity of BPT.

• It does not require Hα, allowing us to go up to z ~ 1
with optical spectrographs. If one uses [OII] only, at the
risk of missing high-ionization AGNs, one can go up to
even higher redshifts of z ~ 1.7.

• It is fairly sensitive to low-luminosity AGNs that cannot 
be identified by the BPT diagnostics. 

• It is able to subtract emission line fluxes due to star
formation and extract AGN fluxes. We will make full use
of this feature in Paper-II.
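The redshift limits quoted in the strengths above follow from
requiring that a line stay blueward of the red limit of the
spectrograph, i.e., lambda_rest * (1 + z) <= lambda_max. The short
sketch below evaluates this for the relevant lines. The 10000 Å red
limit is an assumption made here for a red-sensitive optical
spectrograph, and the rest wavelengths are rounded values.

```python
# Approximate rest-frame wavelengths in Angstrom.
REST_WAVELENGTH_A = {
    "[OII]": 3727.0,
    "[OIII]": 5007.0,
    "Halpha": 6563.0,
}


def max_redshift(line, red_limit_a=10000.0):
    """Highest redshift at which `line` is still blueward of the
    assumed red limit of the spectrograph."""
    return red_limit_a / REST_WAVELENGTH_A[line] - 1.0


for name in ("Halpha", "[OIII]", "[OII]"):
    print(name, round(max_redshift(name), 2))
```

With these assumptions, Hα is lost beyond z ~ 0.5, [OIII] remains
accessible up to z ~ 1, and [OII] up to z ~ 1.7, consistent with the
limits stated in the list.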

WEAKNESSES: 

• It requires a robust continuum detection in
well-calibrated spectra (i.e., the method is applicable only to
bright galaxies). However, we deem that multi-wavelength
photometry could be used when continuum spectra are
not available.

• It misses a fraction of AGNs selected from the BPT
diagnostics, X-rays, and radio.

• It suffers from contamination of star forming galaxies. 
But, as emphasized throughout this paper, all methods 
suffer from incompleteness and contamination. 

• It misses weak AGNs in actively star forming galaxies, as
we will fully quantify in Paper-II.

Appendix 1. Extinction law and non-solar metallicity models

In this appendix, we justify the choice of the Cardelli et al.
(1989) extinction curve over the Charlot & Fall (2000) curve.
We also justify the exclusion of non-solar metallicity models.

Fig. 15. Stellar mass distributions. The open, shaded, and filled
histograms show galaxies fit with super-solar, solar, and sub-solar
metallicity models, respectively.

We prefer the Cardelli et al. (1989) curve because it gives
better agreement between SFRs from the spectral fits and those
from Hα corrected for extinction. As shown in the left plot of
Fig. 1, our SFR estimates are reasonably accurate, although
there is a tilt. This tilt becomes slightly larger if we adopt
the Charlot & Fall (2000) extinction curve, as shown in Fig. 14.
Also, the mean offset between SFRs from the spectral fits and
those from Hα and Hβ increases. Furthermore, we find that the
discrepancy between the dust extinctions from the fits and
those from the Balmer decrement becomes larger if we use
Charlot & Fall (2000). Because we extensively use SFRs and dust
estimates from the spectral fits, we prefer to use the Cardelli
et al. (1989) extinction curve to obtain better SFRs and dust
estimates. It might appear odd to change only the extinction
curve of Charlot & Fall (2000) while keeping the two-component
dust model unchanged. However, the two-component model is
physically sensible, and the observed better agreement with
SFRs and dust justifies the modification of the extinction
curve. We further note that Charlot & Fall (2000) calibrated
their model parameters using starburst galaxies, but only a
very small portion of our sample is undergoing such activity.
It is interesting to note that we can predict emission line
fluxes equally well even if we adopt the Charlot & Fall (2000)
model. This is likely due to the degeneracy between SFR and
dust extinction.
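For reference, the Balmer-decrement extinction used as the benchmark
in the comparison above is commonly computed from the observed
Hα/Hβ ratio relative to the intrinsic value of 2.86. The sketch
below shows this standard calculation. The extinction-curve
coefficients are approximate values often quoted for the Cardelli et
al. (1989) curve with R_V = 3.1; they are assumptions here, not
numbers taken from this paper, and the function names are
hypothetical.

```python
import math

# Approximate k(lambda) = A(lambda) / E(B-V) for a Cardelli-type
# curve with R_V = 3.1 (assumed values).
K_HALPHA = 2.53
K_HBETA = 3.61
INTRINSIC_RATIO = 2.86  # intrinsic Halpha/Hbeta ratio


def ebv_from_balmer(ha_flux, hb_flux):
    """Color excess E(B-V) from the observed Balmer decrement."""
    ratio = ha_flux / hb_flux
    return 2.5 / (K_HBETA - K_HALPHA) * math.log10(ratio / INTRINSIC_RATIO)


def correct_halpha(ha_flux, ebv):
    """Extinction-corrected Halpha flux for a given E(B-V)."""
    return ha_flux * 10.0 ** (0.4 * K_HALPHA * ebv)


ebv = ebv_from_balmer(4.0, 1.0)
print(ebv, correct_halpha(4.0, ebv))
```

An observed ratio below 2.86 yields a negative, unphysical E(B-V),
which is why such ratios are a useful flag for inaccurate stellar
absorption corrections.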

Next, we turn to the metallicity dependence of our model
fits. We generate model templates with super-solar, solar,
and sub-solar metallicities and perform the spectral fits using
all the templates. Fig. 15 shows the stellar mass distributions
of the galaxies. Our model fits do not reproduce the
mass-metallicity relation (more massive galaxies are more
metal-rich; e.g., Nelan et al. 2005): the stellar mass
distributions in Fig. 15 do not show any strong dependence of
stellar mass on metallicity. Fig. 15 thus shows that we cannot
estimate the metallicity of galaxies from our spectral fits.
This is not too surprising, because metallicity estimates
require careful absorption line diagnostics to break the
age-metallicity degeneracy (Worthey 1994). Furthermore, sub- and
super-solar metallicity models degrade the accuracy of our
emission line flux estimates (a significant number of galaxies
have Hα/Hβ < 2.86) due to inaccurate corrections for stellar
absorption. For these reasons, we use only the solar metallicity
models in our spectral analyses. We note that Asari et al.
(2007) obtained a correlation between stellar metallicity and
gas-phase metallicity from versatile spectral fits, albeit with
significant scatter.

Appendix 2. Veilleux & Osterbrock diagrams 

Veilleux & Osterbrock (1987) extended the commonly used
BPT diagnostics and showed that the [OI] and [SII] lines are
also sensitive to the presence of AGNs. We have made extensive
comparisons between the Oxygen-excess and BPT diagnostics, and
here we make further comparisons between the Oxygen-excess
method and the Veilleux & Osterbrock (1987) diagrams.

We present the distributions of Oxygen excess on the
Veilleux & Osterbrock (1987) diagrams in Figs. 16 and 17.
The distributions of all galaxies do not show a clear branch of
AGNs like the one seen in Fig. 6. As mentioned in the main
body of the paper, the clear AGN sequence in the BPT diagram
is likely due to the secondary nature of nitrogen. Theoretical
modeling of the Veilleux & Osterbrock (1987) diagrams shows
a strong overlap between star forming galaxies and AGNs on
these diagrams (Stasińska et al. 2006). In fact, the
Oxygen-excess objects spread over both the star forming and AGN
regions of the diagrams.

Fig. 18 shows the locations of the stacked objects on the
Veilleux & Osterbrock (1987) diagrams. The trend is similar to
what is observed in the BPT diagram (Fig. 8), albeit with
a larger degeneracy between star forming galaxies and AGNs.
On these diagrams, only O+B+, O+Bn, and OnBn are clearly in
the AGN region. The other classes are fairly close to each other
and are all in the star forming region. However, as mentioned
above, this does not necessarily mean that they do not host
AGNs, because the star forming sequence and the AGN sequence
overlap on these diagrams.
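The stacked points in Fig. 18 are measured with two estimators,
inverse-variance weighted stacking and median stacking. A minimal
sketch of both is shown below. It assumes, as a simplification made
here, that the input spectra have already been shifted to the rest
frame, resampled onto a common wavelength grid, and normalized; the
function names and the toy arrays are hypothetical.

```python
from statistics import median


def inverse_variance_stack(fluxes, variances):
    """Per-pixel weighted mean with weights 1 / variance."""
    n_pix = len(fluxes[0])
    stacked = []
    for p in range(n_pix):
        weights = [1.0 / var[p] for var in variances]
        total = sum(w * spec[p] for w, spec in zip(weights, fluxes))
        stacked.append(total / sum(weights))
    return stacked


def median_stack(fluxes):
    """Per-pixel median, which is robust against outliers."""
    return [median(column) for column in zip(*fluxes)]


# Three toy two-pixel spectra; the third has a noisy outlier pixel.
fluxes = [[1.0, 2.0], [3.0, 2.0], [100.0, 2.0]]
variances = [[1.0, 1.0], [1.0, 1.0], [100.0, 1.0]]

print(inverse_variance_stack(fluxes, variances))
print(median_stack(fluxes))
```

The weighted mean suppresses noisy pixels through their large
variance, while the median ignores outliers regardless of their
quoted errors, so agreement between the two is a useful consistency
check.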

The distribution of X-ray objects is shown in Fig. 19. Many
of the sources are in the AGN region of the diagrams, but a
fraction of the AGNs is scattered into the star forming region.
This may appear to contrast with the BPT diagram in Fig. 10,
where we observed that most X-ray sources are in the AGN region
of the diagram. Radio sources are shown in Fig. 20. The radio
objects spread over the diagrams, but objects that do not show
any Oxygen excess tend to lie at the bottom-left tip of the star
forming sequence. These objects are likely actively forming
stars, and their radio emission is due to star formation.

Overall, the Veilleux & Osterbrock (1987) diagrams show
results consistent with BPT, although their sensitivity to
intermediate classes is limited. As discussed in the main body
of the paper, most of the Oxygen-excess objects are in the O+Bn
class, and the figures presented in this appendix show that O+Bn
objects are clearly in the AGN region of the diagrams. This adds
further evidence that the Oxygen-excess method works
fairly well in identifying AGNs.

Appendix 3. Host galaxy properties of O-B+ objects

The Oxygen-excess method misses a fraction of BPT AGNs
(O-B+), possibly due to active underlying star formation. It
could also be due to a strong featureless continuum from the
AGN, which affects our SFR estimates. We cannot easily
distinguish these two possibilities, as discussed in the main
body of the paper. We thus do not try to characterize their AGN
activities and host galaxy SFRs. Instead, we quantify the
stellar mass, color, and morphological type of their hosts and
show that these missing AGNs do not change our conclusions in
Paper-II.

Fig. 21 shows the O-B+ fraction as a function of stellar
mass of the host galaxies. The O-B+ objects typically have
10^10-11 M☉ and the fraction is low at the most massive and
least massive ends. As we show in Paper-II, the fraction of the
Oxygen-excess objects is nearly 60% at 10^^ M0. The fraction 
of the missing AGNs is an order of magnitude lower and they 
do not affect our results in any significant way. 

We plot the color and morphology distributions in Fig. 22. As
detailed in Section 4 of Paper-II, the color is the k-corrected
rest-frame u - r color and the morphology is characterized with
the inverse concentration index (Shimasaku et al. 2001; Strateva
et al. 2001) measured in the z-band. The O-B+ objects tend
to have intermediate colors and morphological types, which is
similar to the overall properties of the BPT AGNs (see
Paper-II). They avoid the red sequence, which indicates that
they are undergoing star formation.

Fig. 14. As in Fig. 1, but here we assume the Charlot & Fall (2000) extinction curve. Plotted is a small subset (~20,500 objects) of the entire sample.

Fig. 16. Same as in Fig. 6, but here we use [OI]/Hα in place of [NII]/Hα. The dashed line separating star forming galaxies from AGNs is from Kewley
et al. (2001).

Fig. 17. Same as in Fig. 6, but here we use [SII]/Hα in place of [NII]/Hα. The dashed line separating star forming galaxies from AGNs is from Kewley
et al. (2001).

Fig. 18. Same as Fig. 8, but for the two Veilleux & Osterbrock (1987) diagrams. The locations of the stacked objects are indicated by the points and
arrows. The filled and open circles are measured from the inverse-variance weighted stacking and from the median stacking, respectively.

Fig. 19. As in Fig. 10, but for the two Veilleux & Osterbrock (1987) diagrams.

Fig. 20. As in Fig. 12, but for the two Veilleux & Osterbrock (1987) diagrams.

Fig. 21. Fraction of O-B+ objects as a function of stellar mass.

Overall, the properties of the O-B+ objects are similar to
those of the BPT AGNs in general. We have confirmed that our
conclusions in Paper-II remain unchanged if we include this
missing population in the analysis.